Radiomics nomogram for the preoperative prediction of lymph node metastasis in pancreatic ductal adenocarcinoma

Purpose: To develop and validate a radiomics nomogram for the preoperative prediction of lymph node (LN) metastasis in pancreatic ductal adenocarcinoma (PDAC).
Materials and methods: In this retrospective study, 225 patients with surgically resected, pathologically confirmed PDAC underwent multislice computed tomography (MSCT) between January 2014 and January 2017. Radiomics features were extracted from arterial CT scans. The least absolute shrinkage and selection operator method was used to select the features. Multivariable logistic regression analysis was used to develop the predictive model, and a radiomics nomogram was built and internally validated in 45 consecutive patients with PDAC between February 2017 and December 2017. The performance of the nomogram was assessed in the training and validation cohorts. Finally, the clinical usefulness of the nomogram was estimated using decision curve analysis (DCA).
Results: The radiomics signature, which consisted of 13 selected features of the arterial phase, was significantly associated with LN status (p < 0.05) in both the training and validation cohorts. The multivariable logistic regression model included the radiomics signature and CT-reported LN status. The individualized prediction nomogram showed good discrimination in the training cohort [area under the curve (AUC), 0.75; 95% confidence interval (CI), 0.68–0.82] and in the validation cohort (AUC, 0.81; 95% CI, 0.69–0.94) and good calibration. DCA demonstrated that the radiomics nomogram was clinically useful.
Conclusions: The presented radiomics nomogram that incorporates the radiomics signature and CT-reported LN status is a noninvasive, preoperative prediction tool with favorable predictive accuracy for LN metastasis in patients with PDAC.
Supplementary Information: The online version contains supplementary material available at 10.1186/s40644-021-00443-1.


Introduction
Pancreatic cancer is a highly lethal disease, and its mortality closely parallels its incidence [1,2]. Surgical resection is regarded as the only potentially curative treatment and can result in significantly longer survival than other treatment options. Lymph node (LN) metastases are observed in 70% or more of resected ductal adenocarcinomas and are present even when the primary tumor is small (< 2 cm) [3]. LN variables remain some of the most important individual predictors of survival. Postoperative pathology reliably establishes LN status, but it is not available for preoperative staging.
Endoscopic ultrasonography-guided fine-needle aspiration (EUS-FNA) is considered quite sensitive for detecting LN metastases from pancreatic lesions, but EUS-FNA is an invasive diagnostic tool that is expensive and time consuming and has a considerable risk of complications [4,5]. Several factors limit the use of magnetic resonance imaging (MRI) for determining LN status in clinical cohort settings, including spatial resolution problems, motion artifacts and dose-dependent oversaturation artifacts [6]. Multislice computed tomography (MSCT) is the best initial diagnostic test for pancreatic cancer. However, a meta-analysis that investigated CT for assessing extraregional LN metastases in pancreatic and periampullary cancer yielded a pooled sensitivity of 25% and a positive predictive value (PPV) of 28% [7]. Important clinical objectives, including differentiation of reactive, inflammatory lymphadenopathy from malignant lymphadenopathy and detection/visualization of micrometastases, were not achieved with this technique.
Radiomics is an emerging field that converts imaging data into a high-dimensional mineable feature space using a large number of automatically extracted data characterization algorithms [8,9]. Radiomics provides a noninvasive method for the prediction of LN metastasis. At present, there are few studies on predicting LN metastasis using radiomics [10][11][12][13]. To the best of our knowledge, no studies have determined whether a radiomics signature would enable superior prediction of LN metastasis from PDAC.
Therefore, in the present study, we aimed to develop and validate a radiomics nomogram incorporating a radiomics signature and CT-reported LN status for the preoperative prediction of LN metastasis in patients with PDAC.

Patients
This retrospective single-center study was reviewed and approved by the Biomedical Research Ethics Committee of the Navy Military Medical University of the Chinese People's Liberation Army. Patients were excluded from the study if one of the following criteria was met: patients who had not undergone preoperative standard contrast-enhanced MSCT, had not undergone enhanced MSCT within a month before surgery, had received any treatment (radiotherapy, chemotherapy or chemoradiotherapy) before undergoing imaging studies, had not undergone surgery, were not diagnosed with PDAC by both hematoxylin and eosin (HE) staining and immunohistochemistry, had pathologically confirmed PDAC with mixed differentiation, had pancreatic lesions that could not be visualized by MSCT, had other tumors in the pancreas, or lacked preoperative serum carbohydrate antigen 19-9 (CA 19-9) concentration. Consequently, a total of 225 consecutive patients with PDAC, 137 males (mean age, 60.02 years; age range, 31-77 years) and 88 females (mean age, 63.28 years; age range, 32-80 years), were included in this cross-sectional study at our institution. A flowchart of the study population is presented in Fig. 1. We divided the patients into two independent cohorts. One hundred eighty consecutive patients constituted a training cohort of 107 males (mean age, 59.19 years; age range, 31-75 years) and 73 females (mean age, 62.59 years; age range, 32-75 years). Data were gathered from records between January 2014 and January 2017. Forty-five consecutive patients constituted a validation cohort of 30 males (mean age, 63.00 years; age range, 45-77 years) and 15 females (mean age, 64.13 years; age range, 46-80 years). Data were gathered from records between February 2017 and December 2017.

CT scanning
A 640-slice CT scanner (Aquilion ONE, Canon Medical Systems, Tokyo, Japan) was used with the following CT scan parameters: 120 kV, 150 effective mAs, beam collimation of 160 × 0.5 mm, a matrix of 350 × 350, and a gantry rotation time of 0.5 s. After nonenhanced CT scanning, dynamic contrast-enhanced CT scanning was performed. The scan delay time was determined according to the test bolus. The contrast agent, 90-95 mL of 370 mgI/mL iopromide (Ultravist 370, Bayer Healthcare, Berlin, Germany), was injected at a rate of 5.5 mL/s with a power injector (Medrad Mark V plus, Bayer, Leverkusen, Germany) via the forearm vein, followed by 98 mL of normal saline to flush the tube. Arterial (20-25 s), portal venous (60-70 s), and delayed-phase (110-130 s) scans were performed after contrast agent injection. The slice thickness/intervals of the scan and reconstruction were 0.8/1.0 mm and 1.0/1.0 mm, respectively. The scanning range was from the level of the diaphragm to the level of the pelvis.

Imaging analysis
All CT images were analyzed by two board-certified abdominal radiologists (W.L. and F.X., with 30 and 5 years of experience, respectively) who were aware that the study population had PDAC but were blinded to the clinical and pathologic details.
All tumors were evaluated for the following features: (a) Tumor location was defined as in the head, body, or tail of the pancreas or in multiple locations in the pancreas. (b) Tumor size was defined as the maximum diameter of a cross-section of the tumor [14]. (c) CT-reported LN metastasis was considered if one of the following criteria was met: short-axis diameter of a LN > 10 mm, nonuniform density, nonuniform enhancement, internal necrosis, LN fusion, ill-defined borders, or involvement of surrounding organs or blood vessels [15,16]. (d) Organ invasion was defined as involvement of the liver, spleen, intestines, or stomach in which the tumor could not be separated from the organs. (e) Vascular invasion was defined as invasion of the common hepatic artery, splenic artery and vein, gastroduodenal artery, superior mesenteric artery and vein, or portal vein. The criteria for vascular invasion included vessel occlusion or stenosis or tumor contacting more than half of the perimeter of the vessel.

Radiomics workflow
The radiomics workflow included (a) image segmentation, (b) feature extraction, (c) feature reduction and selection, and (d) predictive model building (Fig. 2). In this study, radiomics features were extracted from arterial CT scans. The draw tool available in the Editor module of 3D Slicer version 4.8.1 (open source software: https://www.slicer.org/) was used to delineate the tumors in multiple slices. The volume of interest was extracted by stacking the corresponding regions of interest (ROIs) delineated slice-by-slice for each patient. Image preprocessing is described in Supplementary 1.
Radiomics feature extraction was conducted using an open source Python package, PyRadiomics 1.2.0 (http://www.radiomics.io/pyradiomics.html) [17]. The feature extraction methods used in this study included two categories: original feature classes and filter classes. The filter classes further included five categories: wavelet, square, square root, logarithm, and exponential. A total of 1029 2D and 3D features from primary tumors in the arterial phase were extracted and divided into five groups: (a) first-order statistics, (b) shape features, (c) gray-level co-occurrence matrix (GLCM) features, (d) gray-level size zone matrix (GLSZM) features, and (e) gray-level run-length matrix (GLRLM) features. More information about the procedures for image segmentation and radiomics feature extraction is reported in Supplementary 2.
To assess interobserver reliability, the ROI segmentation was performed in a blinded fashion by two radiologists: reader 1 (W.L.) and reader 2 (F.X.). To evaluate intraobserver reliability, reader 1 repeated the feature extraction 3 times at 1-week intervals. Reader 1 completed the remaining image segmentations, and the readout sessions were conducted over a period of 1 month. The reliability was calculated by using intraclass correlation coefficients (ICCs). Radiomics features with both interobserver and intraobserver ICC values greater than 0.75 (indicating excellent stability) were selected for subsequent investigation.
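The reliability filter described above can be illustrated with a short sketch. The text does not state which ICC form was used, so the two-way random-effects, absolute-agreement, single-measurement form, ICC(2,1), is assumed here; the function names and the example feature values are hypothetical.

```python
def icc_2_1(ratings):
    """ICC(2,1) for a table with one row per tumor and one column per reading.

    Assumption: two-way random effects, absolute agreement, single measurement.
    The source does not specify the ICC form.
    """
    n = len(ratings)          # number of tumors (subjects)
    k = len(ratings[0])       # number of readers or repeated readings
    grand = sum(sum(row) for row in ratings) / (n * k)
    row_means = [sum(row) / k for row in ratings]
    col_means = [sum(row[j] for row in ratings) / n for j in range(k)]

    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((x - grand) ** 2 for row in ratings for x in row)
    ss_err = ss_total - ss_rows - ss_cols

    msr = ss_rows / (n - 1)                 # between-subject mean square
    msc = ss_cols / (k - 1)                 # between-reader mean square
    mse = ss_err / ((n - 1) * (k - 1))      # residual mean square
    return (msr - mse) / (msr + (k - 1) * mse + k * (msc - mse) / n)


def is_stable(inter_ratings, intra_ratings, cutoff=0.75):
    """Keep a feature only if both ICCs exceed the cutoff used in the text."""
    return icc_2_1(inter_ratings) > cutoff and icc_2_1(intra_ratings) > cutoff


# Hypothetical values of one feature for four tumors.
inter = [[1.0, 1.1], [2.0, 2.1], [3.0, 2.9], [4.0, 4.2]]            # reader 1 vs reader 2
intra = [[1.0, 1.0, 1.1], [2.0, 2.1, 2.0], [3.0, 3.0, 2.9], [4.0, 4.1, 4.0]]
print(is_stable(inter, intra))
```

In practice the filter would be applied to each of the extracted features in turn, and only the features passing both checks would move on to selection.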
As the radiomics features were very high-dimensional compared with the sample size, the least absolute shrinkage and selection operator (LASSO) logistic regression algorithm, suitable for performing regression analysis of high-dimensional data, was used to select the most useful associated features [18]. The LASSO logistic regression model was used with penalty parameter tuning that was conducted by 5-fold cross-validation based on minimum criteria. The radiomics score (rad-score) was calculated for each patient via a linear combination of selected features weighted by their respective coefficients. More information about feature selection can be found in Supplementary 3.
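As a minimal sketch of the last step, the rad-score is a weighted sum of the selected feature values. The feature names, coefficients, and intercept below are hypothetical placeholders, not the fitted values from this study, and whether an intercept is included is an assumption.

```python
def rad_score(features, coefficients, intercept=0.0):
    """Linear combination of selected features weighted by their coefficients.

    features:     dict mapping feature name to its value for one patient
    coefficients: dict mapping each LASSO-selected feature name to its weight
    intercept:    optional constant term (assumed; not confirmed by the source)
    """
    return intercept + sum(
        weight * features[name] for name, weight in coefficients.items()
    )


# Hypothetical coefficients for three selected features.
coefs = {"wavelet_glcm_contrast": 0.8, "log_firstorder_mean": -0.5, "shape_sphericity": 1.2}
patient = {"wavelet_glcm_contrast": 0.25, "log_firstorder_mean": 0.40, "shape_sphericity": 0.10}
print(rad_score(patient, coefs, intercept=-0.1))
```

Features that LASSO shrinks to a zero coefficient simply drop out of the sum, which is why only the selected features need to be listed.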

Development, performance, and validation of a Radiomics model
Multivariable logistic regression analysis was conducted to develop a model for predicting LN metastasis in the primary cohort. To provide a more understandable outcome measure, a nomogram was then constructed by using the selected covariates. The discrimination performance of established models was quantified by the receiver operating characteristic (ROC) curve, which was constructed using bootstrap resampling (times = 500), and the area under the curve (AUC) value [19]. AUC estimates in the prediction models were compared by using the DeLong nonparametric approach [20]. Calibration curves were plotted via bootstrapping with 500 resamples to assess the calibration of the radiomics model, accompanied by the Hosmer-Lemeshow goodness-of-fit test. The performance of the radiomics model was then internally tested in an independent validation cohort by using the formula derived from the primary cohort.
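The discrimination analysis can be sketched as follows. The AUC is computed as the probability that a randomly chosen LN-positive patient receives a higher predicted score than a randomly chosen LN-negative patient, and a confidence interval is taken from 500 bootstrap resamples. The percentile method for the interval is an assumption, since the text reports only the number of resamples; the function names and the seed are illustrative.

```python
import random


def auc(scores, labels):
    """AUC as the Mann-Whitney probability; ties count as one half."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def bootstrap_auc_ci(scores, labels, n_boot=500, alpha=0.05, seed=0):
    """Percentile bootstrap interval for the AUC (assumed method)."""
    rng = random.Random(seed)
    indices = range(len(scores))
    stats = []
    while len(stats) < n_boot:
        sample = [rng.choice(indices) for _ in indices]
        ys = [labels[i] for i in sample]
        if 0 < sum(ys) < len(ys):  # a resample needs both classes
            stats.append(auc([scores[i] for i in sample], ys))
    stats.sort()
    lower = stats[int((alpha / 2) * n_boot)]
    upper = stats[int((1 - alpha / 2) * n_boot) - 1]
    return lower, upper


# Hypothetical predicted probabilities and pathologic LN status (1 = positive).
probs = [0.82, 0.35, 0.67, 0.41, 0.90, 0.22, 0.58, 0.49, 0.73, 0.30]
truth = [1, 0, 1, 1, 1, 0, 0, 0, 1, 0]
print(auc(probs, truth), bootstrap_auc_ci(probs, truth))
```

Fixing the seed makes the interval reproducible; a different seed would give slightly different bounds, which is expected for a resampling estimate.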

Clinical utility of the Radiomics nomogram
To estimate the clinical utility of the nomogram, decision curve analysis (DCA) was performed by calculating the net benefits for a range of threshold probabilities (Supplementary 4).
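For reference, the net benefit at a threshold probability p_t is commonly computed as TP/n − (FP/n) × p_t/(1 − p_t), where TP and FP are the true-positive and false-positive counts when patients with predicted probability at or above p_t are classified as positive. The sketch below follows that standard definition; the exact procedure used in this study is given in Supplementary 4 and may differ in detail, and the example data are hypothetical.

```python
def net_benefit(probs, labels, threshold):
    """Net benefit of acting on the model at a given threshold probability."""
    n = len(labels)
    tp = sum(1 for p, y in zip(probs, labels) if p >= threshold and y == 1)
    fp = sum(1 for p, y in zip(probs, labels) if p >= threshold and y == 0)
    return tp / n - (fp / n) * threshold / (1 - threshold)


def net_benefit_treat_all(labels, threshold):
    """Net benefit when every patient is assumed to have LN metastasis."""
    prevalence = sum(labels) / len(labels)
    return prevalence - (1 - prevalence) * threshold / (1 - threshold)


# Hypothetical predicted probabilities and pathologic LN status (1 = positive).
probs = [0.9, 0.8, 0.6, 0.4, 0.3, 0.2]
truth = [1, 1, 0, 1, 0, 0]
for pt in (0.25, 0.50, 0.75):
    # The treat-none strategy has a net benefit of 0 at every threshold.
    print(pt, net_benefit(probs, truth, pt), net_benefit_treat_all(truth, pt))
```

A decision curve plots these values across a range of thresholds; the model is considered useful where its curve lies above both the treat-all curve and the zero line.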

Pathological image analysis
All the specimens were analyzed by a specialized pathologist. Pathological examinations and analyses were standardized according to a formal protocol [21]. The resected specimens were immediately fixed in formalin for 24 h. Subsequently, they were cut horizontally into 5-mm tissue blocks that were dehydrated and embedded in paraffin. Finally, 5-μm sections were stained with HE for conventional histology. Each large section was carefully examined by light microscopy. Tumor-node-metastasis (TNM) staging was performed on the basis of the American Joint Committee on Cancer TNM Staging Manual, 8th Edition [15].

Statistical analysis
Normal distribution and variance homogeneity tests were performed on all continuous variables. Continuous variables with a normal distribution are expressed as mean ± SD; variables with a non-normal distribution are expressed as the median and interquartile range. The rad-score was expressed as ten times. First, we examined group differences in terms of age, gender, body mass index (BMI), CA 19-9 level, tumor location, tumor (T) grade, differentiation grade, and the rad-score between the LN-positive and LN-negative patients. Student's t test (normal distribution), the Kruskal-Wallis H test (skewed distribution), and the chi-square test (categorical variables) were used to identify significant differences between the two groups. Second, patients were categorized into quartiles (Q1 < −0.45, Q2 [−0.45 to 0], Q3 [0 to 0.46], and Q4 ≥ 0.46) on the basis of the rad-score, with Q1 as the reference group. Univariate regression analysis was applied to estimate the effect sizes between all variables and LN metastasis. Variables that reached statistical significance in the univariable analysis were considered for the multivariable model.
A two-tailed p-value less than 0.05 was considered statistically significant. All analyses were performed with R (R version 3.3.3; R Foundation for Statistical Computing; http://www.r-project.org) and EmpowerStats (X&Y Solutions, Inc., Boston, MA, USA).

Clinical characteristics
The LN-negative and LN-positive patients accounted for 47.56% (107) and 52.44% (118) of the study cohort, respectively. There was a significant difference in M stage between the LN-positive and LN-negative patients in the training cohort (p = 0.043). However, there were no significant differences in age, gender, CA 19-9 level, T stage, M stage of the validation cohort or differentiation grade (p > 0.05) between the 2 groups. The patient characteristics are shown in Table 1.

Tumor MSCT features
Among various CT findings, tumor size in the validation cohort and CT-reported LN status in the training cohort differed significantly between the LN-negative and LN-positive patients. However, there were no significant differences in tumor location, tumor size in the training cohort, vascular invasion, organ invasion, and CT-reported LN status in the validation cohort between the LN-negative and LN-positive patients (Table 1).

Radiomics analysis
A total of 1029 radiomics features from the arterial phase of CT were extracted and grouped on the basis of LN metastasis. We removed the 480 radiomics features with ICC values < 0.75. The interobserver ICCs of the remaining 549 radiomics features were good, ranging from 0.80 to 0.91. The intraobserver ICCs of these 549 features were also good, ranging from 0.86 to 0.92. Next, the radiomics features that did not differ significantly between the groups or did not show significant correlations with LN status were excluded. The 24 remaining radiomics features were further reduced to 13 features using a LASSO logistic regression model. Finally, the radiomics signature was constructed, and the radiomics score was calculated by using formula 1.
There was a significant difference in arterial rad-score between the LN-positive and LN-negative patients (p = 0.002) (Supplementary 5).

Univariate analysis of each parameter
The univariate analysis results are shown in Table 2. The rad-score (p < 0.0001) and CT-reported LN status (p = 0.014) were significantly associated with an increased risk for LN metastasis.

Development, performance, and validation of prediction models
Logistic regression analysis identified the rad-score and CT-reported LN status as independent predictors (Table 3). A model that incorporated these two independent predictors was developed and presented as a nomogram (Fig. 3A). In addition, a prediction model based on CT-reported LN status alone was built for comparison. All ROC curves are provided in Fig. 3B and 3C. In the primary cohort, the radiomics model showed the highest discrimination between LNs that were positive and negative for metastasis, with an AUC of 0.75 (95% CI: 0.68, 0.82); the observed AUC value was higher than that for CT-reported LN status (AUC, 0.59 [95% CI: 0.51, 0.67]; p < 0.0001). In the validation cohort, the radiomics model yielded the greatest AUC (0.81; 95% CI: 0.69, 0.94), which confirmed that the radiomics model achieved better predictive efficacy than CT-reported LN status (AUC, 0.63 [95% CI: 0.46, 0.80]; p = 0.02). In the radiomics model, the sensitivity, specificity, and accuracy for the training cohort were 67.68%, 75.31%, and 71.1%, respectively, whereas those for the validation cohort were 84.21%, 69.23%, and 75.6%, respectively. The calibration curve of the radiomics nomogram demonstrated good agreement between predicted and observed LN metastasis in the primary cohort (Fig. 3D). The Hosmer-Lemeshow test yielded a p-value of 0.88, suggesting no significant departure from a good fit. The favorable calibration of the radiomics nomogram was further confirmed in the validation cohort (Fig. 3E). The Hosmer-Lemeshow test yielded a p-value of 0.63, again suggesting no significant departure from a good fit in the validation set.
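As a worked check of the reported sensitivity, specificity, and accuracy, the sketch below recomputes them from confusion-matrix counts. The counts are not stated in the text; they are inferred here as the integers that reproduce the reported values and match the cohort sizes (180 and 45 patients) and the overall split of 118 LN-positive and 107 LN-negative patients, so they should be read as an assumption.

```python
def diagnostic_metrics(tp, fn, tn, fp):
    """Return (sensitivity, specificity, accuracy) from confusion-matrix counts."""
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    accuracy = (tp + tn) / (tp + fn + tn + fp)
    return sensitivity, specificity, accuracy


# Inferred counts (assumption): 99 LN-positive and 81 LN-negative in training,
# 19 LN-positive and 26 LN-negative in validation.
training = diagnostic_metrics(tp=67, fn=32, tn=61, fp=20)
validation = diagnostic_metrics(tp=16, fn=3, tn=18, fp=8)

for name, (sens, spec, acc) in (("training", training), ("validation", validation)):
    print(f"{name}: sensitivity {sens:.2%}, specificity {spec:.2%}, accuracy {acc:.1%}")
```

That a single set of integer counts reproduces all six reported figures suggests the values are internally consistent, although the actual classification threshold used for the nomogram is not given in this section.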

Clinical use
The DCA in the validation set showed that if the threshold probability is between 0.25 and 0.75, using the radiomics nomogram in the current study to predict LN metastases adds more benefit than the treat-all-patients scheme or the treat-none scheme (Fig. 4).

Discussion
We developed and validated a diagnostic, radiomics signature-based nomogram for the preoperative individualized prediction of LN metastasis in patients with PDAC. The nomogram incorporates two items, the rad-score and CT-reported LN status. Incorporating these two factors into an easy-to-use nomogram facilitates the preoperative individualized prediction of LN metastasis.
PDAC is characterized by an extremely high mortality and a poor prognosis, which are largely attributed to difficulties in early diagnosis and limited therapeutic options. The number of positive LNs has been shown to be a crucial and independent prognostic factor for overall survival in PDAC [22]. Pancreatectomy is the most effective method to improve long-term patient survival. Whether pancreatectomy should include standard and extended lymphadenectomy is still debated [23,24]. Accurate preoperative LN staging of PDAC is essential for providing patients with appropriate counsel regarding surgical decisions and prognosis. However, this remains difficult with the currently available methods.
High-risk patients should consider neoadjuvant therapy if LN metastases are confirmed by endoscopic ultrasonography-guided FNA (EUS-FNA) [25]. EUS-FNA is considered quite sensitive for the detection of pancreatic lesions and offers diagnostic value for both the primary tumor and LN metastases [4,5]. A piece of tissue that can provide sufficient histological information to help diagnose peripancreatobiliary LN involvement can be obtained with EUS-FNA. For FNA of LNs, suction is not recommended to reduce blood contamination [26]. In addition, EUS-FNA is affected by various factors, such as scope position [27], lesion characteristics, the environment surrounding the lesions, and the evaluating pathologist [27][28][29][30]. Positron emission tomography-computed tomography (PET/CT) is limited in its ability to evaluate small lesions and cannot differentiate between inflammatory lymphadenopathy and metastatic lymphadenopathy [31]. Similarly, MRI has several limiting factors associated with the determination of LN status in clinical settings, namely, spatial resolution problems, motion artifacts, and dose-dependent oversaturation artifacts [6]. The most widely used preoperative  evidence of LN metastasis. Second, CT has limited visualization ability to identify metastatic LNs. Finally, there is no significant correlation between LN metastasis and the clinical and pathologic characteristics of PDAC patients. In addition, local inflammation secondary to malignant biliary obstruction may independently result in enlarged LNs [38]. In the current study, we found no significant correlation between LN metastasis and CT-reported tumor size or vascular or organ invasion. Thus, improved predictive tools for preoperative LN staging are urgently needed. In our study, the arterial radiomics signature was significantly associated with LN status (p < 0.05 for both the training and validation cohorts).
At present, there are few studies on predicting LN metastasis using radiomics. Wu et al. [10] developed and validated a radiomics nomogram that incorporated the radiomics signature and CT-reported LN status and showed good calibration and discrimination in a training set (AUC, 0.9262; 95% CI, 0.8657-0.9868) and in a validation set (AUC, 0.8986; 95% CI, 0.7613-0.9901). Huang et al. [12] developed and validated a radiomics nomogram that included the radiomics signature, carcinoembryonic antigen (CEA) level and CT-reported LN status, and the prediction model yielded C-indexes of 0.736 (95% CI, 0.730 to 0.742) in the training cohort and 0.778 (95% CI, 0.769 to 0.787) in the validation cohort. A nomogram incorporating some clinical and pathological factors to predict the prognosis of PDAC has been reported [39,40]. However, it was difficult to incorporate the radiomics signature, imaging findings and clinical factors to predict LN metastasis from PDAC. In the current study, the rad-score and CT-reported LN status were incorporated into an easy-to-use nomogram to facilitate the preoperative individualized prediction of LN metastasis. Our nomogram performed well in both the training (AUC, 0.75; 95% CI, 0.68-0.82) and validation cohorts (AUC, 0.81; 95% CI, 0.69-0.94). Our nomogram also showed good calibration in both the training and validation cohorts.
To go beyond the purely mathematical measures of performance, such as the AUC, DCA was used to estimate the predicted net benefit of the model across all possible risk thresholds, thus making it easier to evaluate the effects of various risk thresholds [41,42]. DCA showed that if the threshold probability is between 0.25 and 0.75, using the current radiomics nomogram to predict LN metastases added more benefit than either the treat-all or treat-none scheme.
The current study has some limitations. The AUC in the training cohort was lower than that in the validation cohort. The study lacked external validation of the model. Multicenter validation with a larger sample size is needed to acquire high-level evidence for clinical application. In addition, genetic markers have not yet been incorporated into our nomogram. Previous studies have shown that Smad4/DPC4 and MTA1 mRNA expression levels may be involved in the progression of PDAC, particularly in LN metastasis [43,44]. A combination of gene marker panels and a radiomics signature may improve the ability to predict LN metastasis in patients with PDAC.
Fig. 4 DCA for the rad-score. DCA for the radiomics nomogram. The y-axis represents the net benefit. The red line represents the radiomics nomogram. The gray line represents the hypothesis that all patients had LN metastases. The black line represents the hypothesis that no patients had LN metastases. The x-axis represents the threshold probability, which is where the expected benefit of treatment is equal to the expected benefit of avoiding treatment. The decision curves in the validation set showed that if the threshold probability is between 0.25 and 0.75, the radiomics nomogram developed in the current study to predict LN metastases adds more benefit than the treat-all or treat-none scheme.

Conclusion
Our radiomics nomogram, which is a noninvasive predictive tool that combines a radiomics signature with CT-reported LN status, shows favorable accuracy for preoperatively predicting LN metastasis in PDAC patients, especially in LN-positive patients.